-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgraded concurrency framework to support panic=unwind mode #16
Conversation
async fn replica_state(&self, ctx: &ctx::Ctx) -> StorageResult<Option<ReplicaState>> { | ||
scope::wait_blocking(ctx, || { | ||
async fn replica_state(&self, _ctx: &ctx::Ctx) -> StorageResult<Option<ReplicaState>> { | ||
scope::wait_blocking(|| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we've discussed this previously, but it could be worth bringing up another time. If the context is canceled, wouldn't it make sense to immediately return a cancellation error from wait_blocking
, even if the provided closure runs to completion (because we cannot stop it)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't changed the semantics - scope waits for all tasks to finish no matter if they ignore the ctx cancellation or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afaict, our use cases for this function are about running a blocking function in non-blocking context rather than allowing things to run after context cancellation. In general it is a bug to spawn a blocking context-unaware method from within a scope (it is against structured concurrency principle), so I don't want to facilitate such an antipattern (with the current semantics a context-unaware function should cause a deadlock which will propagate up the stack and should be relatively easy to notice). In case of rocksdb access it is ok-ish, because it is expected to finish in a bounded amount of time (in fact all filesystem accesses in unix don't have a proper notion of cancellability).
We are going to use era-consensus code in zksync-era, which doesn't have panic=abort enabled for the time being. This PR adjust the concurrency framework so that it behaves in the expected way in the presence of panics: